PROCEEDINGS 


OF THE 


NATIONAL ACADEMY OF SCIENCES 


Volume 30 February 15, 1944 Number 2 
Copyright 1944 by the National Academy of Sciences 


RELATIVE NUMBER OF NON-INVARIANT OPERATORS IN A 
GROUP 


By G. A. MILLER 


DEPARTMENT OF MATHEMATICS, UNIVERSITY OF ILLINOIS 
Communicated December 30, 1943 


Since the number of the invariant operators of a given group G of
order g is equal to the order c of the central of G, the number of the
non-invariant operators of G is g − c. In every non-abelian group the central
quotient group is known to be non-cyclic and hence its order is a
composite divisor of the order of G. In particular, when the order of G is the
product of two distinct prime numbers and G is non-abelian the number
of the non-invariant operators of G is g − 1, and when the order of the
non-abelian group G is $p^3$, p being a prime number, then the number of the
non-invariant operators of G is $p^3 - p = p(p^2 - 1)$. A non-abelian group of
order g can therefore not contain fewer non-invariant operators than g minus g
divided by the product of the two smallest prime factors of g (counted with
multiplicity).

From the preceding paragraph it results that a non-abelian group of 
order g cannot have less than 3g/4 non-invariant operators and that it 
must involve a larger number of such operators when g is not divisible by 4. 
In fact, it must involve a larger number of such operators unless g is di- 
visible by 8 as we proceed to prove. Suppose that G contains exactly 
3g/4 non-invariant operators and hence its central quotient group is the 
four group. Its central will then give rise to three co-sets corresponding 
to the three operators of order 2 in the four group. If $s_1$ is an operator
from one of these co-sets and $s_2$ is an operator from another of these three
co-sets then $s_1$ and $s_2$ are necessarily non-commutative since otherwise they
would appear in the central of G.

A commutator which arises from $s_1$ and $s_2$ must be in the central of G
and it must be of order 2 since the squares of all of the operators of 
G must appear in the central of G. It therefore results that the central of G 
is of even order and hence the order of G is divisible by 8. It is obvious that 
each of the two non-abelian groups of order 8 satisfies the condition that 
the number of the non-invariant operators contained therein is 3g/4, where 





26 MATHEMATICS: G. A. MILLER Proc. N. A. S. 


g is the order of the group. Moreover the direct product of one of these 
groups of order 8 and an arbitrary abelian group evidently satisfies the 
same condition. Hence there results the following theorem: A necessary
and sufficient condition that there is at least one group of order g in which
the number of the non-invariant operators is exactly 3g/4 is that g is divisible 
by 8 and there is no non-abelian group of order g which contains a smaller 
number of non-invariant operators. This theorem can clearly be extended. 


When the number of the non-invariant operators of G is exactly 3g/4 
all the operators of odd order contained in G must appear in its central. 
Hence such a G is the direct product of its Sylow subgroups and all of its 
Sylow subgroups of odd order are abelian while its Sylow subgroup whose 
order is a power of 2 has a commutator subgroup of order 2 as was noted 
above. The determination of the groups which involve exactly 3g/4 
non-invariant operators is therefore reduced to the determination of such 
groups having an order which is a power of 2. In particular, if a group 
whose order is not divisible by 16 has the property that it contains exactly 
3g/4 non-invariant operators it is the direct product of one of the two non- 
abelian groups of order 8 and an arbitrary abelian group of odd order, and 
all such direct products have the property that the number of the non- 
invariant operators contained in each of them is 3g/4, where g is the order 
of the group. 

If G contains exactly 3g/4 non-invariant operators then it contains three
abelian subgroups of order $2^m$, m > 1, which have $2^{m-1}$ operators in
common and these common operators are contained in the central of G while
the Sylow subgroup whose order is a power of 2 is of order $2^{m+1}$. The
commutator subgroup of order 2 of G appears in the central of G but it is not
necessarily generated by an operator of higher order in G as results from a
non-abelian group of order 16 in which two of the three abelian subgroups
of order 8 are of type 2, 1 and the central is the four group. Two of the
three operators of order 2 in this four group are squares of operators of 
order 4 contained therein while the third is a commutator of the group. 
When the number of the non-invariant operators of a group is 3g/4 the 
smallest number of these operators is 6 since it must be an even number 
and the two non-abelian groups of order 8 are the only groups which have 
separately exactly 6 non-invariant operators. 
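
The count g − c can be checked mechanically on small examples. The following is a minimal sketch, not part of the original argument: it assumes each group is given by permutation generators (the generator tuples and the names `D4_GENS`, `Q8_GENS`, `D4_X_Z3_GENS` are illustrative choices), closes them under multiplication, and counts the operators that fail to commute with some generator.

```python
def compose(p, q):
    """Return the permutation that applies q first and then p."""
    return tuple(p[i] for i in q)

def generate(gens):
    """Close a list of permutation generators under multiplication."""
    identity = tuple(range(len(gens[0])))
    elems = {identity}
    frontier = [identity]
    while frontier:
        new = []
        for e in frontier:
            for g in gens:
                h = compose(g, e)
                if h not in elems:
                    elems.add(h)
                    new.append(h)
        frontier = new
    return elems

def non_invariant_count(gens):
    """Return (g, g - c): the group order and the number of non-central operators."""
    group = generate(gens)
    # An operator is invariant exactly when it commutes with every generator.
    central = [z for z in group
               if all(compose(z, x) == compose(x, z) for x in gens)]
    return len(group), len(group) - len(central)

# Dihedral group of order 8 acting on the four vertices of a square.
D4_GENS = [(1, 2, 3, 0), (2, 1, 0, 3)]
# Quaternion group: left multiplication by i and j on (1, -1, i, -i, j, -j, k, -k).
Q8_GENS = [(2, 3, 1, 0, 6, 7, 5, 4), (4, 5, 7, 6, 1, 0, 2, 3)]
# Direct product of the dihedral group of order 8 with a cyclic group of order 3.
D4_X_Z3_GENS = [(1, 2, 3, 0, 4, 5, 6), (2, 1, 0, 3, 4, 5, 6), (0, 1, 2, 3, 5, 6, 4)]

for name, gens in [("D4", D4_GENS), ("Q8", Q8_GENS), ("D4 x Z3", D4_X_Z3_GENS)]:
    order, non_invariant = non_invariant_count(gens)
    print(name, order, non_invariant, 4 * non_invariant == 3 * order)
```

In each of these three cases the number of non-invariant operators equals 3g/4 (6 of 8, 6 of 8, and 18 of 24), in agreement with the statements above.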

Every group of order g which has the property that exactly 3g/4 of its 
operators are non-invariant contains a multiple of 6 non-invariant opera- 
tors and it is possible to construct such a group in which the total number 
of non-invariant operators is an arbitrary multiple of 6. This results 
directly from the fact that in the direct product of a group which involves 
exactly 6 such operators and an abelian group of arbitrary order the 
number of the non-invariant operators is equal to 6 times this arbitrary 










order. In particular, the groups which contain exactly 12 non-invariant 
operators are of order 16 and have a central of order 4. There is one and 
only one such group which contains the cyclic subgroup of order 8. The re- 
maining operators of this group transform each operator of the cyclic 
group into its fifth power. There is obviously another such group in which 
the central is the cyclic group of order 4 and there are four such groups in 
which the central is the non-cyclic group of order 4.

There is only a finite number of different non-abelian groups which sepa- 
rately have the property that each of them contains the same number 
of non-invariant operators. This number of groups may be zero. The 
fact that there is only a finite number of such different groups is a direct 
consequence of the theorem that there is an upper limit for the order of 
such groups since this order cannot be as large as twice the number of the 
non-invariant operators contained therein. On the contrary, the number
of the different groups which contain separately the same given number 
of non-invariant subgroups is not necessarily limited by this number of 
the non-invariant subgroups contained in these separate groups. In fact, 
it is known that there is an infinite system of groups such that each of 
them contains two and only two non-invariant subgroups. (Cf. these 
Proceedings, 29, 105 (1943).) There is, however, no group which contains 
exactly two non-invariant operators as will be more fully explained. 

If the number of the non-invariant operators of G exceeds 3g/4 it is at 
least 5g/6 because the order of the central quotient group is then at least 6 
and if this order is 6 the central quotient group is the non-cyclic group of 
this order. The direct product of this group and an arbitrary abelian group 
evidently furnishes an infinite system of groups such that the number of the 
non-invariant operators contained in each of these groups is 5g/6, g being 
the order of the group. The number of the non-invariant operators in 
each of these groups is a multiple of 5 and there is at least one group in 
which this number is an arbitrary multiple of 5. The symmetric group of 
order 6 is the only one of these groups which contains exactly 5 non- 
invariant operators and this group is characterized by the fact that it 
contains exactly 5 non-invariant operators. 

A necessary and sufficient condition that there exists a group in which the 
number of non-invariant operators is exactly a given prime number p is
that there exists a group of order p + 1 which contains no invariant operator 
besides the identity. This theorem results from the fact that this prime 
number is equal to the order of the central quotient group diminished by 
unity and that this central quotient group cannot involve any invariant 
operator besides the identity. It therefore results that there exists no 
group in which the number of the non-invariant operators is exactly 3, 7 or 
31, but that there exists at least one non-abelian group in which the num- 
ber of the non-invariant operators is any one of the prime numbers in the 








28 MATHEMATICS: H. BATEMAN Proc. N. A. S. 


following set: 5, 11, 13, 17, 19, 23, 29. It would be a very simple matter 
to extend these sets. 

It is now easy to prove the following theorem: Every group in which 
the total number of non-invariant operators is a prime number contains no 
invariant operator besides the identity. In fact, if the total number of the
non-invariant operators of the group is a prime number the order of the 
central quotient group diminished by unity must be equal to this prime 
number. The central quotient group must be simply isomorphic with 
the group since the number of the non-invariant operators of a group 
is the product of the order of the central, and the order of the central 
quotient group diminished by unity. The converse of this theorem is 
obviously not necessarily true since it is easy to find groups which contain 
no invariant operator besides the identity but in which the number of the 
non-invariant operators is not a prime number. For instance, the di- 
hedral group of order 10 has this property. 

There is one and only one group in which the total number of the non- 
invariant operators is one of the three prime numbers 5, 11, 13 but there are 
two groups in which the number of the non-invariant operators is exactly 
17. One of these is the dihedral group of order 18 and the other is the 
generalized dihedral group of this order. When the number of the non- 
invariant operators contained in G is exactly 23 the symmetric group of 
order 24 is the only one of the 15 groups of this order which contains no 
invariant operator besides the identity and hence this group is character- 
ized by the fact that it contains exactly 23 non-invariant operators. This 
characterization of the symmetric group of degree 4 may therefore be 
added to the numerous known definitions of this well-known group. 
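
The prime cases quoted above can be illustrated in the same way. This is a sketch under the assumption that the groups are realised as small permutation groups; the dictionary `GROUPS`, its keys, and the helper names are hypothetical labels introduced here, not notation from the paper.

```python
def compose(p, q):
    """Return the permutation that applies q first and then p."""
    return tuple(p[i] for i in q)

def generate(gens):
    """Close a list of permutation generators under multiplication."""
    identity = tuple(range(len(gens[0])))
    elems, frontier = {identity}, [identity]
    while frontier:
        new = []
        for e in frontier:
            for g in gens:
                h = compose(g, e)
                if h not in elems:
                    elems.add(h)
                    new.append(h)
        frontier = new
    return elems

def order_and_non_invariant(gens):
    """Return (g, g - c) for the group generated by gens."""
    group = generate(gens)
    central = sum(1 for z in group
                  if all(compose(z, x) == compose(x, z) for x in gens))
    return len(group), len(group) - central

def dihedral(n):
    """Generators of the dihedral group of order 2n acting on n points."""
    rotation = tuple((i + 1) % n for i in range(n))
    reflection = tuple((-i) % n for i in range(n))
    return [rotation, reflection]

def is_prime(k):
    return k > 1 and all(k % d for d in range(2, int(k ** 0.5) + 1))

GROUPS = {
    "symmetric, order 6": [(1, 2, 0), (1, 0, 2)],
    "alternating, order 12": [(1, 2, 0, 3), (1, 0, 3, 2)],
    "dihedral, order 14": dihedral(7),
    "dihedral, order 18": dihedral(9),
    # Generalized dihedral group: an involution inverting a group of type (3, 3).
    "generalized dihedral, order 18": [(1, 2, 0, 3, 4, 5), (0, 1, 2, 4, 5, 3),
                                       (0, 2, 1, 3, 5, 4)],
    "symmetric, order 24": [(1, 2, 3, 0), (1, 0, 2, 3)],
    "dihedral, order 10": dihedral(5),
}

RESULTS = {name: order_and_non_invariant(gens) for name, gens in GROUPS.items()}
for name, (order, count) in RESULTS.items():
    print(name, order, count, "prime" if is_prime(count) else "not prime")
```

Every group in this list has only the identity as an invariant operator, so the count is g − 1; it is prime for all of them except the dihedral group of order 10, whose 9 non-invariant operators supply the counterexample to the converse.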


NOTE ON THE FUNCTION F(a, b; c − n; z)
By H. BATEMAN*
NORMAN BRIDGE LABORATORY OF PHYSICS, CALIFORNIA INSTITUTE OF TECHNOLOGY


Communicated January 8, 1944 


The generating function

$$(1-t)^{b-1}(1-z+zt)^{-a} = \sum_{n=0}^{\infty} \frac{t^n}{n!}\,(1-b,\, n)\, F(a, b;\, b-n;\, z), \qquad |t| < 1,\quad |zt| < |1-z|,$$

may be used to find an estimate of F(a, b; b − n; z) for large positive
values of n. When the point t = 1 − 1/z lies outside the circle |t| = 1 the
singularity t = 1 may be used to find an estimate by the method of
Darboux¹ and the result is










$$F(a, b;\, b-n;\, z) \sim 1.$$
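
A numerical sanity check of the generating function and of the estimate can be sketched as follows. The parameter values a = 0.5, b = 0.3, z = 0.2, t = 0.3 are arbitrary illustrative choices (they satisfy |t| < 1 and |zt| < |1 − z|, and place 1 − 1/z = −4 outside the unit circle); the truncated power series used for F is an assumption suitable only for |z| < 1 and non-integral b.

```python
import math

def hyp2f1(a, b, c, z, terms=600):
    """Truncated Gauss series for F(a, b; c; z); adequate here because |z| < 1."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= (a + k) * (b + k) * z / ((c + k) * (k + 1))
    return total

def pochhammer(x, n):
    """The product x (x + 1) ... (x + n - 1), written (x, n) in the text."""
    out = 1.0
    for k in range(n):
        out *= x + k
    return out

a, b, z, t = 0.5, 0.3, 0.2, 0.3

# Left-hand member of the generating relation.
closed_form = (1 - t) ** (b - 1) * (1 - z + z * t) ** (-a)

# Right-hand member, truncated after 80 terms (t**80 is negligible).
series = sum(t ** n / math.factorial(n) * pochhammer(1 - b, n)
             * hyp2f1(a, b, b - n, z) for n in range(80))

print(closed_form, series)

# F(a, b; b - n; z) approaches unity as n grows.
for n in (10, 50, 200):
    print(n, hyp2f1(a, b, b - n, z))
```

The two members agree to rounding error for these values, and the departure of F(a, b; b − n; z) from unity shrinks roughly like 1/n.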


When c − b is an integer the foregoing result indicates that F(a, b; c − n;
z) ~ 1 under the same restriction on z. This is the result used in previous
papers² but the case in which the point t = 1 − 1/z lies within the unit
circle or on its circumference was not considered. These cases can also
be treated by the method of Darboux. It may be noted, however, that
when 1 − 1/z lies inside the unit circle the point 1 − 1/(1 − z) lies outside
the circle. Use may then be made of the formula of Barnes³

$$F(a, b;\, c-n;\, z) = A\,F(a, b;\, 1+a+b-c+n;\, 1-z) + B\,z^{1-c+n}(1-z)^{c-a-b-n}\,F(1-a, 1-b;\, 1-a-b+c-n;\, 1-z)$$

where

$$A = \frac{\Gamma(c-n-a-b)\,\Gamma(c-n)}{\Gamma(c-n-a)\,\Gamma(c-n-b)}, \qquad B = \frac{\Gamma(a+b-c+n)\,\Gamma(c-n)}{\Gamma(a)\,\Gamma(b)}.$$


It should be noticed that (1 − a − b + c) − (1 − a) is an integer when
c − b is an integer so the previous result may be applied. It is probable,
however, that the result holds without this restriction. The first term is
O(1) when n is large and so in this case we have the estimate

$$F(a, b;\, c-n;\, z) \sim \frac{\pi\, n^{a+b-1}\, z^{1-c+n}\,(1-z)^{c-a-b-n}}{\Gamma(a)\,\Gamma(b)\,\sin \pi(c-n)}.$$
On this account restrictions are needed for the deduction

$$\lim_{n\to\infty} F(-m, -r;\, -n-r;\, x) = 1.$$

When the point 1 − 1/z lies on the unit circle
there are two possible singularities to be taken into consideration in the
method of Darboux and so unity should be added to the foregoing
expression. If, however, R(a + b) < 1 the term just mentioned is negligible in
comparison with unity, and if R(a + b) > 1 it is the dominant term. The
case in which R(a + b) = 1 is exceptional because then both terms must
be taken into consideration.
It is noteworthy that in the case z = 1/2 there are two singularities on
the unit circle. For the function of Mittag-Leffler

$$g_n(z) = 2z\,F(1-n, 1-z;\, 2;\, 2) = 2^{-z}\binom{z}{n}\,F(z+1, z;\, 1+z-n;\, 1/2)$$

the condition R(a + b) < 1 is satisfied when R(z) < 0 and in this case an
estimate is

$$g_n(z) \sim 2^{-z}\binom{z}{n}.$$
When R(z) > 0 use may be made of the formula








30 BIOCHEMISTRY: TATUM AND BONNER Proc. N. A. S. 


$$g_n(-z) = (-)^n g_n(z),$$

so in this case

$$g_n(z) \sim (-)^n\, 2^{z}\binom{-z}{n} = 2^{z}(z, n)/n!.$$

For the function

$$g_n(z, r) = \binom{r}{n}\,F(-n, z;\, -r;\, 2) = \binom{z+r}{n}\,2^{-z}\,F(z, r+z+1;\, r+z-n+1;\, 1/2)$$

the condition R(a + b) < 1 is satisfied when R(2z + r) < 0 and so the estimate

$$g_n(z, r) \sim \binom{z+r}{n}\,2^{-z}$$

holds under this condition. The equation

$$g_n(-z-r, r) = (-)^n g_n(z, r)$$

indicates that when R(2z + r) > 0 the estimate is

$$g_n(z, r) \sim (-)^n\, 2^{z+r}\binom{-z}{n} = 2^{z+r}(z, n)/n!.$$

When R(2z + r) = 0 the proper estimate is

$$g_n(z, r) \sim \binom{z+r}{n}\,2^{-z} + 2^{z+r}(z, n)/n!$$

and in the case of the polynomial of Mittag-Leffler, when R(z) = 0,

$$g_n(z) \sim 2^{-z}\binom{z}{n} + 2^{z}(z, n)/n!.$$
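
The two-term estimate for the polynomial of Mittag-Leffler can be compared with exact values. The sketch below assumes the standard generating function ((1 + t)/(1 − t))^z for g_n(z), from which the three-term recurrence in the code follows on differentiating; the recurrence, the function names, and the test values z = ±0.5, n = 400 are illustrative assumptions, not taken from the note.

```python
def mittag_leffler(n, z):
    """g_n(z) from (k + 1) g_{k+1} = 2 z g_k + (k - 1) g_{k-1}, g_0 = 1, g_1 = 2z."""
    prev, cur = 1.0, 2.0 * z
    if n == 0:
        return prev
    for k in range(1, n):
        prev, cur = cur, (2.0 * z * cur + (k - 1) * prev) / (k + 1)
    return cur

def binomial(x, n):
    """Generalized binomial coefficient x (x - 1) ... (x - n + 1) / n!."""
    out = 1.0
    for k in range(n):
        out *= (x - k) / (k + 1)
    return out

def rising_over_factorial(x, n):
    """The quantity (x, n)/n! = x (x + 1) ... (x + n - 1) / n!."""
    out = 1.0
    for k in range(n):
        out *= (x + k) / (k + 1)
    return out

def darboux_estimate(n, z):
    """Sum of the contributions of the singularities t = -1 and t = 1."""
    return 2.0 ** (-z) * binomial(z, n) + 2.0 ** z * rising_over_factorial(z, n)

for z in (-0.5, 0.5):
    n = 400
    exact = mittag_leffler(n, z)
    estimate = darboux_estimate(n, z)
    print(z, exact, estimate, abs(estimate / exact - 1.0))
```

For real z the term belonging to the dominant singularity automatically outweighs the other, so the sum reproduces the one-term estimates for R(z) < 0 and R(z) > 0 as well.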


* For list of Errata to previous articles, see page 44, infra. 

1 Darboux, G., Jour. de Math., (3) 6, 1-56, 377-416 (1878). 

2 Bateman, H., Proc. Nat. Acad. Sci., 26, 491-496 (1940); 28, 371-374 (1942). 
3 Barnes, E. W., Proc. London Math. Soc., (2) 6, 157 (1908). 


INDOLE AND SERINE IN THE BIOSYNTHESIS AND 
BREAKDOWN OF TRYPTOPHANE* 


By E. L. TATUM AND DAVID BONNER
DEPARTMENT OF BIOLOGY, STANFORD UNIVERSITY 
Communicated January 31, 1944 


Indole and anthranilic acid have been suggested as intermediates in the 
biosynthesis of tryptophane by certain bacteria.¹,² Among the mutant
strains of Neurospora, produced as described elsewhere,³ two types of
tryptophane deficient strains have been found. One strain will grow in 
the presence of indole, and the other in the presence of either indole or 










anthranilic acid. With the use of these mutants it has been shown that 
both of these compounds are intermediates in the synthesis of tryptophane 
by Neurospora and that this synthesis has been blocked before anthranilic 
acid in one mutant and between anthranilic acid and indole in the other 
(10575).⁴ This paper deals with the nature of the reaction connecting
indole with tryptophane. Certain of the results with Neurospora have been
included in a preliminary note.⁵

Synthesis of Tryptophane by Neurospora.—Tests of a number of possible 
intermediates between indole and tryptophane, including skatole,
indole-acetic acid, indolepropionic acid, l-kynurenine, tryptamine, indole-lactic
acid, and indolepyruvic acid (sterilized by filtration) showed that none was
active in promoting growth of the tryptophane deficient strain 10575.
These inactive compounds cannot be converted to either indole or
tryptophane, and presumably are not intermediates in the synthesis of
tryptophane by Neurospora. Since these substances include the theoretically
possible intermediates in a stepwise synthesis of tryptophane from in- 
dole, it seemed, therefore, that this synthesis might involve a single re- 
action. 

In an attempt to determine the nature of this step in the synthesis of 
tryptophane from indole, a number of experiments were carried out both 
with actively growing cultures of Neurospora and with mycelium in non- 
nutrient media. Indole was determined by treating samples of culture 
media with p-dimethylaminobenzaldehyde essentially as described by 
Happold and Hoyle,⁶ and measuring the color developed with a photoelectric
colorimeter. The disappearance of indole and the effects of various
substances on this disappearance were followed by this method. The 
experimental results may be summarized as follows. Indole disappears 
from the culture solutions rather slowly. The addition of serine causes 
the indole to disappear much more rapidly. The action of serine is specific: 
α- or β-alanine, pyruvic acid, glyceraldehyde, phosphoglyceric acid,
glycolic acid, threonine, Cori-ester and sucrose are without effect.
l(−)-Serine⁷ has twice the activity of dl-serine. When serine is limiting, both
the rate of disappearance of indole and the amount of indole used are
functions of the serine concentration as shown by representative data
presented in figure 1. Results essentially similar to these have been
obtained with the normal wild-type strain of Neurospora. It is therefore
clear that l(−)-serine is involved in the utilization of indole by
Neurospora.

Since Neurospora grown on indole has been found to contain trypto- 
phane, it seemed possible that this synthesis might be taking place under 
the experimental conditions used in this investigation. Colorimetric 
tests for tryptophane in the solutions after disappearance of indole
supported this interpretation. Further experiments were therefore
undertaken. Cultures of the wild-type and mutant strains were grown with
constant shaking. After 3 days mycelia representing about 1 gm. dry 
material were washed and incubated with 50 mg. indole and 500 mg. serine 
in 1000 ml. of distilled water. After 48 hours at 25°C. with constant 
shaking the indole had disappeared. Tryptophane was then determined 
in the medium essentially as described by Woods,⁸ but with a photoelectric
colorimeter. The results indicated that tryptophane was present in yields 
of 30 to 50 per cent of theory, while the indole content of control cultures 
without serine had decreased very little, and no tryptophane could be 


[Figure 1: percent of indole taken up (vertical axis) against time in hours (horizontal axis); curves for 2 mg. l(−)-serine, 0.5 mg. l(−)-serine, and no serine.]

FIGURE 1
The effect of serine on the uptake of indole by Neurospora mycelium. Each 125-ml.
culture flask contained 1 mg. indole and ca. 50 mg. of 3-day-old washed mycelium of
strain 10575 in 50 ml. distilled water. The flasks were stoppered and incubated at
25°C. with constant shaking.


detected. The results of biological assay for tryptophane on the indole- 
serine solutions using strain 10575 agreed well with the results of the 
colorimeter assay. This shows that the product is predominantly
l(−)-tryptophane, since the d(+) form is inactive for Neurospora strain 10575.
In other control cultures to which l(−)-tryptophane and serine had been
added, about 40 per cent of the tryptophane remained after 48 hours'
incubation. It therefore seems evident that all of the indole was converted
into l(−)-tryptophane, but that 50 to 70 per cent of the tryptophane was
broken down during the incubation period. One of the products of this
decomposition of tryptophane was found to be indole-acetic acid.⁹
Indole-acetic acid is produced and excreted by Neurospora under the conditions
of the experiments only in the presence of tryptophane or of a mixture of 
indole and serine. The postulation of the synthesis of tryptophane in a 
reaction involving indole and serine is further supported by the observation 
that in growth experiments Neurospora will tolerate higher concentrations 
of indole in the presence of added serine. Tryptophane is much less toxic 
than indole. 
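
The phrase "per cent of theory" can be made concrete with a short calculation. This is only a sketch: the molecular weights below are standard values supplied here as an assumption (they are not quoted in the paper), and the 50 mg. indole charge and the 40 per cent survival figure are taken from the text above.

```python
# Standard molecular weights in g/mol (assumed, not stated in the paper).
INDOLE_MW = 117.15        # C8H7N
TRYPTOPHANE_MW = 204.23   # C11H12N2O2

def theoretical_tryptophane_mg(indole_mg):
    """Tryptophane expected if every mole of indole gives one mole of tryptophane."""
    return indole_mg * TRYPTOPHANE_MW / INDOLE_MW

theory_mg = theoretical_tryptophane_mg(50.0)            # 50 mg. indole per incubation
observed_range_mg = (0.30 * theory_mg, 0.50 * theory_mg)  # yields of 30 to 50 per cent

# Control cultures retained about 40 per cent of added tryptophane after 48 hours,
# so complete conversion of indole would leave roughly this much in the medium.
surviving_fraction = 0.40
expected_if_complete_mg = surviving_fraction * theory_mg

print(round(theory_mg, 1), [round(v, 1) for v in observed_range_mg],
      round(expected_if_complete_mg, 1))
```

On these assumptions 50 mg. of indole corresponds to about 87 mg. of tryptophane, and the amount expected after breakdown falls inside the observed 30 to 50 per cent range, which is the reasoning behind the statement that all of the indole was converted.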

Although the evidence strongly indicated that indole reacts with serine 
to give tryptophane, it is possible that neither the colorimetric nor the 
biological assays were measuring tryptophane specifically. Final proof 
was obtained by the actual isolation of tryptophane from the incubation 
mixture. 

A solution which by colorimetric and biological assays contained about
40 mg. of tryptophane was concentrated in vacuum, precipitated with
HgSO₄, the precipitate freed of mercury by treatment with Ba(OH)₂ and
H₂S, and the H₂S removed in vacuum. The solution was adjusted to pH
6 with Ba(OH)₂ and H₂SO₄, and concentrated under reduced pressure. It
proved difficult to crystallize the product from this solution and therefore
it was treated with acetic anhydride as described by Berg, Rose and
Marvel.¹⁰ This procedure leads to the racemization of tryptophane with
the production of acetyl dl-tryptophane.¹¹ The acetylated product was
extracted from acid solution with ether and the ether removed.
Crystallization from this solution was difficult until it was found that very small
amounts of indole inhibited the crystallization of acetyl-tryptophane.
After treating the solution with a little Norite the product crystallized
easily from water. Approximately 15 mg. of crystalline material were
obtained. After purification by recrystallization this product gave the
correct melting point for acetyl-dl-tryptophane (204-205°C.) and showed
no depression when mixed with authentic acetyl-dl-tryptophane.
Analysis¹² gave the following results: Calculated for C₁₃H₁₅N₂O₃: 63.10% C;
6.12% H; 11.34% N. Found: 63.74% C; 5.94% H; 11.73% N. The
isolated material therefore was acetyl-dl-tryptophane, and the product of 
the reaction between indole and serine was tryptophane. Since the
colorimetric and biological assay values agreed this presumably was the
biologically active l(−)-isomer. While it is possible that serine does not
actually enter into the reaction but merely permits the accumulation of
tryptophane after its formation from indole by some other reaction, this
does not seem likely. The structural relationship of serine to the side
chain of tryptophane, the specificity of serine in the reaction, and the
failure to detect tryptophane in an incubation mixture with indole or
serine alone make it probable that in Neurospora indole reacts with
l(−)-serine to give l(−)-tryptophane. One possibility is that an intermediate
such as α-aminoacrylic acid¹³ is involved. Since only l(−)-serine is active,
serine could react directly with indole with the elimination of water as
follows: 

Indole + HOCH₂-CH(NH₂)-COOH (Serine)  →  Tryptophane + H₂O


Since all possible intermediates tested failed to support growth of mutant 
strain 10575, it seems probable that the condensation of indole and serine 
is involved in the normal synthesis of tryptophane by Neurospora. 

Production of Indole from Tryptophane by Escherichia coli.—The
formation of indole from tryptophane by bacteria has been studied by a number
of investigators.*% 1415 In this reaction no suspected intermediates 
tested give rise to indole directly. Indolelactic and indolepyruvic acids 
according to Woods may give indole, but must first be converted to
tryptophane.⁸ E. coli resembles Neurospora in that skatole, indoleacetic acid,
indolepropionic acid, l-kynurenine and tryptamine cannot be converted
into tryptophane. 

It seemed possible that the bacterial production of indole from trypto- 
phane might be a simple reversal of the synthetic reaction carried on by 
Neurospora. If this were true it should be possible to modify the produc- 
tion of indole through the addition of serine. 

Sterile-washed suspensions of E. coli (K-12)¹⁶ were prepared as described
by Woods.⁸ The cell suspensions were incubated with tryptophane in
flasks on a shaking machine and indole determined colorimetrically at 
intervals on aliquot samples. Corrections were made for the turbidity 
due to the cells. 

The results of a typical experiment are shown in figure 2. The addition
of dl-serine greatly slows down the production of indole. As with
Neurospora, the effect of serine is specific, and is a function of the serine
concentration. In our experiments glucose and α-alanine have no influence on
the reaction. Evans, et al., found that of twelve amino acids (not including
serine) tested in growth experiments, the production of indole from
tryptophane was affected only by phenylalanine and tyrosine, which two amino
acids, however, had no influence on the production of indole by active cell
suspensions.¹⁴ We have also performed a few experiments in which indole
alone, and indole with serine were incubated with coli suspensions. The
results showed clearly that serine has an effect on the disappearance of 




















[Figure 2: indole produced against time in hours (horizontal axis, 0 to 22); curves for no serine and for 1, 5, 10, 25 and 50 mg. dl-serine.]

FIGURE 2
The effect of dl-serine on the production of indole from tryptophane by E. coli
suspensions. Each 125-ml. culture flask contained 2 mg. l-tryptophane and 4 ml. of a
suspension of washed E. coli cells (ca. 20 mg.) in 50 ml. of 0.05 M phosphate buffer at
pH 7.2. The flasks were plugged with cotton and incubated at 25°C. with constant
shaking.


indole similar to that in Neurospora. Colorimetric tests for tryptophane 
after 24 hours’ incubation indicated that some tryptophane had been syn- 
thesized from indole in the presence of serine. 

These results support the view that the production of indole from trypto- 
phane by E. coli actually involves a reversal of the reaction postulated for 
Neurospora. In this case indole and serine would be formed directly from 
tryptophane and the serine further oxidized, probably through pyruvic 
acid.¹³ This oxidation of serine would account for the end-products of the
reaction and the O₂ consumption which were observed by Woods.⁸










Krebs, Hafez and Eggleston¹⁵ have suggested that E. coli produces
indole from tryptophane by way of kynurenine and
o-aminophenylacetaldehyde. In a direct test our active coli suspension failed to produce
indole from l-kynurenine. The production of indole from
o-aminophenylethanol in the experiments of Krebs, et al., indicates only that the bacteria
can oxidize the alcohol to the aldehyde. These postulated intermediates 
are probably not concerned in the normal production of indole from trypto- 
phane, since our evidence supports the simpler hypothesis involving the 
direct production of indole and serine. 

Discussion.—The β-hydroxyl group in serine apparently makes this
amino acid quite reactive biologically. Binkley and du Vigneaud¹⁷
have described the rôle of serine in the biosynthesis of cysteine from
homocysteine. Chargaff and Sprinson¹³ have studied the oxidation of serine
by a number of bacteria including E. coli, and have suggested that the
first reaction is an intramolecular dehydration with the formation of
α-aminoacrylic acid. This reaction is analogous to the proposed
intermolecular dehydration involved in the condensation of indole with serine.

The relation between indole and serine is not necessarily as simple as 
that proposed. The main direction of the reaction seems to be different 
in the two organisms used. Indole production is predominant with coli, 
but the reaction apparently is reversible. With Neurospora the synthetic 
reaction goes well, but it has so far been impossible to demonstrate the 
production of indole from tryptophane. However, analogous directional 
specificities are known in other reversible biological reactions such as the 
oxidation and reduction of inorganic nitrogen or sulfur compounds by 
bacteria. 

The biosynthesis of tryptophane in Neurospora and probably in E. coli
takes place by reactions which do not include the conventional reductive 
amination of the keto acid analog. This may be true of the synthesis of 
tryptophane in other organisms and possibly of the biosyntheses of certain 
other amino acids. 

Summary.—The biosynthesis of tryptophane by the mold Neurospora 
crassa takes place through a direct reaction between indole and serine, 
possibly an intermolecular dehydration. 

A reversal of this reaction is involved in the production of indole from 
tryptophane by the bacterium Escherichia coli.


* Work supported by grants from the Rockefeller Foundation. 

1 Fildes, P., Brit. Jour. Exptl. Path., 22, 293 (1941). 

2 Snell, E. E., Arch. Biochem., 2, 389 (1943).

3 Beadle, G. W., and Tatum, E. L., Proc. Nat. Acad. Sci., 27, 499 (1941).
4 Tatum, E. L., Bonner, D., and Beadle, G. W., Arch. Biochem. (in press).
5 Tatum, E. L., and Bonner, D., Jour. Biol. Chem., 151, 349 (1943).

6 Happold, F. C., and Hoyle, L., Biochem. Jour., 28, 1171 (1934). 
















Vol. 30, 1944 PATHOLOGY: WILSON AND WORCESTER 37


7 Generously made available by Dr. Max Bergmann, Rockefeller Institute, New York. 

8 Woods, D. D., Biochem. Jour., 29, 640 (1935). 

9 Determined by the pea test, see van Overbeek, J., and Went, F. W., Bot. Gaz., 99, 22
(1937).

10 Berg, C. P., Rose, W. C., and Marvel, C. S., Jour. Biol. Chem., 85, 207 (1929-1930).

11 du Vigneaud, V., and Sealock, R. R., Ibid., 96, 511 (1932).

12 Microanalysis by Dr. A. J. Haagen-Smit, California Institute of Technology.

13 Chargaff, E., and Sprinson, D. B., Jour. Biol. Chem., 151, 273 (1943).

14 Evans, W. C., Handley, W. C. R., and Happold, F. C., Biochem. Jour., 35, 207
(1941).

15 Krebs, H. A., Hafez, M. M., and Eggleston, L. V., Ibid., 36, 306 (1942).

16 Culture obtained from Dr. C. E. Clifton, Department of Bacteriology, Stanford
University.

17 Binkley, F., and du Vigneaud, V., Jour. Biol. Chem., 144, 507 (1942).


A SECOND APPROXIMATION TO SOPER’S EPIDEMIC CURVE 
By EDWIN B. WILSON AND JANE WORCESTER


HARVARD SCHOOL OF PUBLIC HEALTH 


Communicated January 20, 1944 


If τ be the latent period between infection and infectiousness and σ be
the period of infectiousness, the equation which expresses Soper's
formulation of the course of an epidemic is¹

$$A - \frac{dS}{dt} = rS\,\bigl[A\sigma - S(t-\tau) + S(t-\tau-\sigma)\bigr], \qquad (1)$$

where A is the rate at which new susceptibles are recruited to the
population, S is the number of susceptibles at time t, and r is a rate of contact
between the infectious and the susceptibles. The "law of mass action" is
assumed, namely, that the rate at which the susceptibles are infected is
proportional jointly to the number of susceptibles and the number of the
infectious. If the right-hand member of (1) is expanded by Taylor's
Theorem about the value t − (τ + σ/2) we have

$$A - \frac{dS}{dt} = r\sigma S\left[A - \frac{dS}{dt}\bigg|_{t-\tau-\sigma/2} - \frac{\sigma^2}{24}\,\frac{d^3S}{dt^3}\bigg|_{t-\tau-\sigma/2} - \cdots\right]. \qquad (2)$$








Soper introduces a number of additional assumptions, of which one is that 
the time of infectiousness ¢ is so short that all the derivatives of higher 
order than the first may be disregarded.” For the steady state, could one 
exist, where S is constant and equal to m, equation (1) shows that raS = 1, 
that is, yom = 1. In that state one infectious person would infect just one 








38 PATHOLOGY: WILSON AND WORCESTER Proc. N. A. S. 


susceptible. We may express $S$ relative to this number $m$ by introducing $x = S/m$. The equation then becomes

$$\frac{A}{m} - \frac{dx}{dt} = x\left[\frac{A}{m} - \left.\frac{dx}{dt}\right|_{t-\tau-\sigma/2}\right]. \tag{3}$$

The period $\tau + \sigma/2$ is Soper’s “incubation period.” If it be introduced as the unit of time, $t$ must be replaced by $T(\tau + \sigma/2)$, and if the rate of recruits $A$ be henceforth defined as the number of recruits in this unit of time, equation (3) will maintain the same form in $T$, viz.,

$$\frac{A}{m} - \frac{dx}{dT} = x\left[\frac{A}{m} - \left.\frac{dx}{dT}\right|_{T-1}\right]. \tag{3'}$$

We shall introduce the rate of new infections $z$ measured relative to $m$ and its logarithm, namely,

$$z = \frac{A}{m} - \frac{dx}{dT}, \qquad u = \log z,$$

in place of $x$. Then on differentiating (3') and shifting the origin of time by a unit we have

$$\left.\frac{du}{dT}\right|_{T+1} - \left.\frac{du}{dT}\right|_{T} = \frac{A}{m}\,e^{u(T) - u(T+1)} - e^{u(T)},$$

which is the equation in $u$ which corresponds rigorously to Soper’s assumptions. By methods of approximation he arrives at the equation

$$\frac{d^2u}{dT^2} = \frac{A}{m} - e^{u}. \tag{4}$$

The rigorous derivation given above shows that he has effectively replaced on the left-hand side of the equation

$$\left.\frac{du}{dT}\right|_{T+1} - \left.\frac{du}{dT}\right|_{T} = \frac{d^2u}{dT^2} + \frac{1}{2}\frac{d^3u}{dT^3} + \frac{1}{6}\frac{d^4u}{dT^4} + \cdots$$

by the first term of its expansion and on the right-hand side has replaced $e^{u(T) - u(T+1)}$ by 1 as though in this expression $u(T)$ and $u(T + 1)$ did not differ appreciably. 
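
The passage from (3') to the equation in $u$ is stated without intermediate steps. A sketch of the algebra, assuming only the definitions $z = A/m - dx/dT$ and $u = \log z$ and reading (3') as $z(T) = x(T)\,z(T-1)$, might run as follows; the layout is ours and is not taken from the paper.

```latex
% (3') written in terms of z:  z(T) = x(T) z(T-1),  so  1/x = e^{u(T-1) - u(T)}.
\begin{align*}
u(T) - u(T-1) &= \log x(T), \\
\left.\frac{du}{dT}\right|_{T} - \left.\frac{du}{dT}\right|_{T-1}
  &= \frac{1}{x}\frac{dx}{dT}
   = \left[\frac{A}{m} - e^{u(T)}\right] e^{u(T-1) - u(T)}, \\
\left.\frac{du}{dT}\right|_{T+1} - \left.\frac{du}{dT}\right|_{T}
  &= \frac{A}{m}\,e^{u(T) - u(T+1)} - e^{u(T)}
  \qquad (T \to T + 1).
\end{align*}
```

Setting $e^{u(T) - u(T+1)}$ equal to 1 and keeping only the second derivative on the left then gives the form attributed to Soper.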
It is our purpose to examine, for the simple case where the epidemic is sufficiently short so that the number of recruits does not significantly affect its course, the integrals of

$$\frac{d^2u}{dT^2} = -e^{u} \quad\text{and of}\quad \left.\frac{du}{dT}\right|_{T+1} - \left.\frac{du}{dT}\right|_{T} = -e^{u(T)}. \tag{4', 4''}$$


The integral of the first is readily found as 










$$u = 2\log\operatorname{sech}\sqrt{z_0/2}\,T + \log z_0, \qquad z = z_0\operatorname{sech}^2\sqrt{z_0/2}\,T, \tag{5}$$


where $z_0$ is the rate of new cases³ at the peak of the epidemic which is taken as the origin of time. With $A = 0$, $z = -dx/dT$ and, if we integrate $z$, we have

$$x = \text{const.} - \sqrt{2z_0}\,\tanh\sqrt{z_0/2}\,T. \tag{6}$$


The constant may be determined if we know the value of $x$ when $T = 0$, i.e., at the peak of the epidemic. The fundamental relations between $C$ or $z$ and $S$ or $x$ lead, however, in case $A = 0$, to the equation⁴

$$x(T) = \frac{z(T)}{z(T-1)} = \frac{\operatorname{sech}^2\sqrt{z_0/2}\,T}{\operatorname{sech}^2\sqrt{z_0/2}\,(T-1)}. \tag{6'}$$


It is readily seen that (6) and (6') are not equivalent no matter what value be assumed for the constant of integration. The discrepancy between (6) and (6') cannot be avoided as it is inherent in the approximate nature of the solution in (6) for (4'). If we use (6') we find that when $T = 0$, the value of $x$ is $\cosh^2\sqrt{z_0/2}$ and this value might be used for the constant in (6). On the other hand if we observe from (6') that for $T = 1/2$, $x = 1$, we should obtain the value of the constant in (6) as $1 + \sqrt{2z_0}\tanh\sqrt{z_0/8}$ which is not equal to $\cosh^2\sqrt{z_0/2}$ except as both may be considered equal to $1 + z_0/2$. Indeed if we substitute the approximation (5) in (4''), it is by no means clear that the approximation is good. Soper frankly made the hypothesis that the epidemics were so moderate that terms of higher order could be neglected⁵; it would follow that the discrepancies which we have been citing would have to be considered as inconsequential. Our aim is precisely to see whether a better approximation may not be found, at least in the simple case where the rate of recruits $A$ may be neglected. 
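
The size of the discrepancy between the two candidate constants can be checked numerically. The following is a minimal sketch, assuming the first approximation (5) and the two expressions for the constant quoted above; the function names and the sample values of the peak rate are illustrative and do not come from the paper.

```python
import math

def first_approximation(T, z0):
    """Case rate from (5): z = z0 * sech^2(sqrt(z0/2) * T)."""
    return z0 / math.cosh(math.sqrt(z0 / 2.0) * T) ** 2

def constant_from_peak(z0):
    """Constant of (6) when x(0) is read off (6'): cosh^2(sqrt(z0/2))."""
    return math.cosh(math.sqrt(z0 / 2.0)) ** 2

def constant_from_half_period(z0):
    """Constant of (6) when x = 1 is imposed at T = 1/2."""
    return 1.0 + math.sqrt(2.0 * z0) * math.tanh(math.sqrt(z0 / 8.0))

if __name__ == "__main__":
    # Illustrative peak rates only; they are not data from the paper.
    for z0 in (0.05, 0.3, 0.6):
        print(z0,
              constant_from_peak(z0),
              constant_from_half_period(z0),
              1.0 + z0 / 2.0)
```

For small peak rates the two constants should both lie close to $1 + z_0/2$, and the gap between them should widen as the peak rate grows, which is the sense in which the first approximation is adequate only for moderate epidemics.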

To discuss approximations one has to have some estimate of the values of $z_0$: how large are they? We cannot obtain $z_0$ by dividing reported cases at the peak during an incubation period by the value $m$, because in the first place the cases reported are only a fraction of the true cases and in the second place we have at best only crude estimates of the number $m$ of susceptibles just sufficient for one case to make one new case. If we take a few clear-cut epidemics of measles we may determine the constants of the expression $z = A\operatorname{sech}^2 BT$, which is the first approximation to Soper’s epidemic curve, either by a cumulative plot on growth paper or by a calculation based upon moments⁶ using

$$B = \frac{\pi}{2\sigma\sqrt{3}}, \qquad A = \frac{\pi}{4\sigma\sqrt{3}} \times \text{total cases}, \tag{7}$$

where $\sigma$ is the standard deviation of the time measured in fortnights (which 










Soper takes to be the incubation period of measles). On the graph on growth paper⁷ the value of the constant $B$ is 2.945 divided by the elapsed time in incubation periods from the 5% to the 95% points on the fitted straight line. The value of $B$ is independent of the fraction of cases reported, provided only that the fraction does not change during the course of the epidemic, whether it be determined graphically or by moments; and as, in the approximate solution, $B = \sqrt{z_0/2}$, the observed value of $B$ gives a determination of $z_0$ in so far as the theory is valid. For a number of epidemics of measles we find (Fig. 1) values of $B$ ranging⁸ from about 0.20 to about 0.60. 
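
The two ways of estimating $B$ can be sketched in a few lines. This is a minimal illustration, assuming the curve $A\operatorname{sech}^2 BT$, whose cumulative form is proportional to $(1 + \tanh BT)/2$; the function names are ours. The spacing of the 5% and 95% points then comes out as $2\tanh^{-1}0.9 \approx 2.944$ in units of $BT$, close to the 2.945 quoted in the text.

```python
import math

SQRT3 = math.sqrt(3.0)

def b_from_growth_paper(elapsed_periods):
    """B from the elapsed time, in incubation periods, between the 5% and 95% points.

    For the cumulative curve (1 + tanh(B*T))/2 these points lie
    2*atanh(0.9) apart in B*T.
    """
    return 2.0 * math.atanh(0.9) / elapsed_periods

def b_from_moments(sd_periods):
    """B from the standard deviation of case times, first relation of (7)."""
    return math.pi / (2.0 * SQRT3 * sd_periods)

def a_from_moments(sd_periods, total_cases):
    """A from the standard deviation and the total cases, second relation of (7)."""
    return math.pi * total_cases / (4.0 * SQRT3 * sd_periods)

def peak_rate_from_b(b):
    """z0 implied by the first approximation, in which B = sqrt(z0/2)."""
    return 2.0 * b * b
```

A fitted $B$ of 0.20, for instance, would correspond under the first approximation to a peak rate $z_0 = 2B^2 = 0.08$. The relations in (7) are also consistent with the total under the curve being $2A/B$.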





[Figure 1: four panels, VANCOUVER 1931-32, BERKELEY 1938-39, PROVIDENCE 1934-35 and OAKLAND 1933-34; vertical axis, cumulative per cent of cases on the growth-paper scale; horizontal axis, months.]

FIGURE 1
Cumulative graphs on growth paper of epidemics of measles in Vancouver, B. C., 1931-1932; Berkeley, Calif., 1938-1939; Providence, R. I., 1934-1935; and Oakland, Calif., 1933-1934. 

The exact equation for the epidemic curve with $A = 0$, viz.

$$\left.\frac{du}{dT}\right|_{T+1} - \left.\frac{du}{dT}\right|_{T} = -e^{u(T)}, \tag{8}$$

gives the difference of the slopes a unit time apart. If it be considered that the values $u_0$ and $\alpha = du/dT$ at any time $T$ be known, the value of $u$ at $T + 1$ upon the tangent to the curve will be $u_0 + \alpha$; the slope of the tangent to the curve at $T + 1$ is, however, $\alpha - e^{u_0}$ and if we had proceeded from $T$ to $T + 1$ upon a line of this slope we should have reached a value $u_0 + \alpha - e^{u_0}$. If we assume that a better value than either of these is their mean,⁹ we have for the new value

$$u(T + 1) = u(T) + \left.\frac{du}{dT}\right|_{T} - \frac{1}{2}e^{u(T)}. \tag{9}$$

With this relationship one can compute successive values of $u$ stepwise from any assumed initial conditions $u = u_0$ and $du/dT = \alpha$. To go backward stepwise it is necessary to solve for $u(T - 1)$ the equation¹⁰

$$u(T - 1) = u(T) - \left.\frac{du}{dT}\right|_{T} - \frac{1}{2}e^{u(T-1)}. \tag{9'}$$

In this manner we may plot out $u$ as a function of the time for positive and negative integral values of $T$ starting from any initial conditions, and may fill in intermediate values if we desire by using the parabola,

$$u(T + \theta) = u(T) + \theta\left.\frac{du}{dT}\right|_{T} - \frac{1}{2}e^{u(T)}\theta^2,$$


where $\theta$ is a positive proper fraction. A second method of following the epidemic curve would be to proceed by undetermined coefficients, writing

$$u = u_0 + \alpha(T - T_0) + \beta(T - T_0)^2 + \gamma(T - T_0)^3 + \cdots \tag{10}$$

and determining $\beta$, $\gamma$, ... in terms of the initial conditions¹¹ $u_0$, $\alpha$ by substitution in (8). Terms up to and including that in $(T - T_0)^5$ seem to give an adequate solution to advance or retreat one time unit at a time when near the maximum and two or more units when fairly well removed from the maximum. 
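
The stepwise procedure lends itself to a short program. The following is a minimal sketch, assuming the forward rule $u(T+1) = u(T) + u'(T) - \tfrac{1}{2}e^{u(T)}$ with the slope updated by the exact relation $u'(T+1) = u'(T) - e^{u(T)}$, and the backward rule solved by simple repeated substitution. The function names and the iteration count are our own choices, not the authors'.

```python
import math

def step_forward(u, slope):
    """Advance one incubation period with (9); the new slope follows from (8)."""
    z = math.exp(u)
    return u + slope - 0.5 * z, slope - z

def step_backward(u, slope, iterations=60):
    """Retreat one incubation period with (9'), by repeated substitution."""
    u_prev = u - slope  # first guess: neglect the term (1/2) e^{u(T-1)}
    for _ in range(iterations):
        u_prev = u - slope - 0.5 * math.exp(u_prev)
    # (8) applied between T-1 and T gives the slope at T-1.
    return u_prev, slope + math.exp(u_prev)

def stepwise_curve(z0, n_steps):
    """Return {T: z(T)} for integer T in [-n_steps, n_steps], with peak z0 at T = 0."""
    curve = {0: z0}
    u, slope = math.log(z0), 0.0  # zero slope at the maximum
    for T in range(1, n_steps + 1):
        u, slope = step_forward(u, slope)
        curve[T] = math.exp(u)
    u, slope = math.log(z0), 0.0
    for T in range(-1, -n_steps - 1, -1):
        u, slope = step_backward(u, slope)
        curve[T] = math.exp(u)
    return curve

if __name__ == "__main__":
    for T, z in sorted(stepwise_curve(0.3, 4).items()):
        print(f"{T:+d}  {z:.3f}")
```

Started from a peak rate of 0.3 with zero slope, this sketch should give values within about 0.001 of those tabulated under (9) in Table 1 for $T$ between $-4$ and 4, and it shows the asymmetry directly: the backward values exceed the forward ones at equal distances from the peak.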

A different method of attack is to proceed directly to modify the equation $d^2u/dT^2 = -e^{u}$, which is the first approximation, over into one which is a better representation of (8). We have

$$\left.\frac{du}{dT}\right|_{T+1} - \left.\frac{du}{dT}\right|_{T} = \frac{d^2u}{dT^2} + \frac{1}{2}\frac{d^3u}{dT^3} + \frac{1}{6}\frac{d^4u}{dT^4} + \cdots = -e^{u}. \tag{11}$$

If we use the approximation $d^2u/dT^2 = -e^{u}$ to find $d^3u/dT^3$, we get as an equation of the second order

$$\frac{d^2u}{dT^2} - \frac{1}{2}e^{u}\frac{du}{dT} = -e^{u}.$$


This equation may be integrated in parametric fashion¹²; but it turns out that a much more convenient, and presumably better, approximation may be had by carrying the process one stage further on the original equation. If

$$\frac{d^2u}{dT^2} + \frac{1}{2}\frac{d^3u}{dT^3} = -e^{u},$$

then

$$\frac{1}{3}\frac{d^3u}{dT^3} + \frac{1}{6}\frac{d^4u}{dT^4} = -\frac{1}{3}e^{u}\frac{du}{dT},$$

and we may write, on substitution in (11),

$$\frac{d^2u}{dT^2} + \frac{1}{6}\frac{d^3u}{dT^3} - \frac{1}{3}e^{u}\frac{du}{dT} = -e^{u}.$$



The value of $d^3u/dT^3$, which would be obtained from the equation by differentiating the equation after neglecting the third derivative, may be obtained and substituted in the equation. The result is¹³

$$\left(1 + \frac{e^{u}}{18}\right)\frac{d^2u}{dT^2} - \frac{e^{u}}{2}\frac{du}{dT} + \frac{e^{u}}{18}\left(\frac{du}{dT}\right)^2 = -e^{u}. \tag{12}$$

If $p = du/dT$, the first integral with $p = 0$ when $u = u_0$ is

$$p^2 + 12p\left[\frac{18 + e^{u_0}}{18 + e^{u}} - 1\right] - 36\left[\frac{18 + e^{u_0}}{18 + e^{u}} - 1\right] = 0.$$


This equation may be solved for $p$ and integrated in the form

$$T = \frac{1}{6}\log\frac{z}{z_0} \pm \frac{1}{3}\sqrt{1 + 18/z_0}\,\tanh^{-1}\sqrt{1 - z/z_0}, \tag{13}$$

where $z_0$ is the case rate at the peak of the epidemic and $T$ is measured from that time, the $+$ sign is used for $T > 0$ and the $-$ sign for $T < 0$. Equation (12) may be solved by an iterative process as

$$\log\frac{z}{z_0} = 2\,\mathrm{ls}\left\{aT - \frac{a}{3}\,\mathrm{ls}\left[aT - \frac{a}{3}\,\mathrm{ls}\,(aT - \text{etc.})\right]\right\}, \tag{14}$$

where ls stands for log sech and $a$ for $3/\sqrt{1 + 18/z_0}$. 
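
The iterative formula can be tried out directly. The following is a minimal sketch, assuming that (13) reads $T = \tfrac{1}{6}\log(z/z_0) \pm \tfrac{1}{3}\sqrt{1 + 18/z_0}\,\tanh^{-1}\sqrt{1 - z/z_0}$ and that the nested factor in (14) is $a/3$. That reading is inferred from the first integral and from the values in Table 1; the function names and the fixed iteration count are ours.

```python
import math

def log_sech(w):
    """ls(w) = log sech(w)."""
    return -math.log(math.cosh(w))

def z_by_iteration(T, z0, iterations=50):
    """Case rate from the nested form (14), starting with '- etc.' taken as zero."""
    a = 3.0 / math.sqrt(1.0 + 18.0 / z0)
    ls = 0.0
    for _ in range(iterations):
        # Each pass wraps one more level of the nested expression.
        ls = log_sech(a * T - (a / 3.0) * ls)
    return z0 * math.exp(2.0 * ls)

def T_from_z(z, z0, after_peak=True):
    """Time at which the case rate equals z, from the closed form (13)."""
    sign = 1.0 if after_peak else -1.0
    root = math.sqrt(1.0 + 18.0 / z0) / 3.0
    return math.log(z / z0) / 6.0 + sign * root * math.atanh(math.sqrt(1.0 - z / z0))

if __name__ == "__main__":
    for T in (0, 1, -1, 2, -2, 3, -3, 4, -4):
        print(f"{T:+d}  {z_by_iteration(T, 0.3):.3f}")
```

Since $a < 3$, each pass of the loop shrinks the error by a factor below one, so a few dozen passes are more than enough. Substituting the iterated value back into `T_from_z` should return the starting time, which is a useful check that the two formulas are being read consistently.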


TABLE 1


TABLE OF VALUES OF THE CASE RATE FOR SOME VALUES OF T


  T     By (5)    By (9)    By (10)         By (14)
  0     0.300     0.300     0.300           0.300
  1     0.259     0.258     0.258           0.258
 -1     0.259     0.263     0.262           0.261
  2     0.174     0.168     0.167           0.166
 -2     0.174     0.184     0.188           0.182
  3     0.098     0.088     0.087           0.087
 -3     0.098     0.112     0.110           0.110
  4     0.050     0.041     0.040           0.040
 -4     0.050     0.062     0.061           0.061
  8     0.002     0.001     Not computed    0.001
 -8     0.002     0.004     Not computed    0.004


The values of $z$, when the initial conditions are $z_0 = 0.3$ at maximum when $T = 0$, for the four approximations (5), (9), (10) and (14) are tabulated for some values of $T$ (table 1). It is seen that the approximations (9), (10), (14) are very much alike and that they give an asymmetrical curve which differs considerably from the symmetrical curve (5). The 










skewness of the curve is negative, the rise to the peak being slower than the 
fall from it. For the particular data tabulated this skewness, as measured 
by the ratio of the third moment to the cube of the standard deviation, is 
about —0.30. The formula (13) or (14) represents the epidemic curve that 
follows from Soper’s theory to a much higher degree of approximation than 
(5) and, we believe, to a quite sufficient accuracy for values of $z_0$ up to 
0.4, that is, for epidemics so severe that at their peak the number of new 
cases per incubation period is about four-tenths of the number of suscep- 
tibles just sufficient for each old case to generate one new case. 


1 Wilson, E. B., and Burke, Mary H., these PROCEEDINGS, 28, 361-367 (1942), footnote 2. The reference to Soper, H. E., is Jour. Roy. Statist. Soc. (London), 92, 34-61 (1929). If the rate of new infections is $C(t)$ and the number of infectious persons is $I(t)$ the equation $-dS/dt = C(t) - A(t)$ implies that none of the persons who become infected return to the population of susceptibles unless allowance therefor is made in the rate of recruits $A$. The “law of mass action” is $C = rIS$ with $I = \int_{t-\tau-\sigma}^{t-\tau} C(t)\,dt$. The rate $A$ is taken as constant. 

2 We keep $\sigma/2$ in the expression $\tau + \sigma/2$ because it is possible that $\sigma$ be small enough so that the higher derivatives in (2) may be neglected and yet large enough to make $\tau + \sigma/2$ a somewhat more accurate “incubation period” or period between generations of the infected than $\tau$. 

3 The rate of new cases $z(T) = C(T)/m$ is the rate at which susceptibles become infected, not the rate at which persons become clinical cases, which is presumably more nearly equivalent to $z(T + 1)$. 


4 For, when $\sigma$ is small $I = \int_{t-\tau-\sigma}^{t-\tau} C(t)\,dt = \sigma C(t - \tau - \sigma/2) = \sigma C(T - 1)$ and $C = rIS$ becomes $C = r\sigma S\,C(T - 1)$ which on division by $m$ with $r\sigma m = 1$ leads to $z(T) = x\,z(T - 1)$. 

5 It should be remarked that Soper’s main interest was in periodicity, which on the 
present theory requires that the rate of recruits A should not be zero, and which brings 
in additional difficulties; we have no desire to minimize Soper’s accomplishments in his 
paper. 

6 When using moments it is particularly desirable that the epidemics rise from 0 and 
drop to 0 in a clear-cut fashion because irregularities at the ends may have a considerable 
effect on the moments; effects due to irregular beginning and ending can be more readily 
disregarded when using the graphical method. 

7 Wilson, E. B., these PROCEEDINGS, 11, 451-456 (1925). 

8 If we use the curve $A\operatorname{sech}^2 BT$ to describe the epidemic empirically (or any similar bell-shaped curve), the values of $A$ and $B$ may be regarded as determinable independently from the data, but as the integral of the approximate differential equation is $z = z_0\operatorname{sech}^2\sqrt{z_0/2}\,T$, it follows that $A$ and $B$ are connected by the relation $A = 2B^2$ and that total cases relative to $m$ are $2A/B$ or $4B$. The range of total cases would therefore be from about $0.80m$ to about $2.40m$ in these epidemics in so far as the theory is valid. 

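9 This is equivalent to fitting a parabola. For if $u = a + bT + cT^2$,

$$\left.\frac{du}{dT}\right|_{T} = b + 2cT, \qquad \left.\frac{du}{dT}\right|_{T+1} = b + 2cT + 2c, \qquad \left.\frac{du}{dT}\right|_{T+1} - \left.\frac{du}{dT}\right|_{T} = 2c = -e^{u(T)}$$

give the result (9) for $u(T + 1) = a + bT + cT^2 + b + 2cT + c$. 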










10 If we first neglect $\tfrac{1}{2}e^{u(T-1)}$ we find $u(T - 1) = u(T) - (du/dT)_T$ which may be substituted in $\tfrac{1}{2}e^{u(T-1)}$ and so on. This iterative process converges rather rapidly. 


11 We find 


gn — Sat = 60 + 12) ial 
faa Ce 


(ae“ — 3e — 6a) 
12 + e” 








5 = 2 (26 - a8 — 8 ir ee 4 - -<( 1g B+ ) 
~ 94 . oe ee ae are 


12 Ans.: 


u =oe| e+ 410e(1-29) +20 | 
? wana 
r= [ &p ' 
P (1-3 ) mo +eanmeli~2 +2 
9? og 3? Pp 


13 Various other equations may be obtained by similar methods of approximation but 
this one seems to be the one which gives the least inconvenient integrals. 
14 The convergence of this iterative formula is fairly rapid; if one knows in advance an approximate value for $\tfrac{1}{2}\log(z/z_0)$ at time $T$ and takes this value as ls $(aT - \text{etc.})$, the convergence will be more rapid than if one starts with “$-$ etc.” as zero and computes ls $aT$ as the first step. 


ERRATA 


In Proc. Nat. Acad. Sci., 28, 374-377 (1942), the following corrections 
are needed: 

In (1.1) sh(x) should be sh(rx). 

In (1.3) the factor '/: should be (1/27). 

In (3.3) a factor mm cosec (m7) should be inserted on the right. 

On account of the last correction the orthogonal relation for E,(x) can 
be found. It is 


S- Enq ix) Eq —ix)dx/sh*(#/ynx) =0 n' #=n,n’' >0,n >0 
et = 2/[xn(n + 1/2)(n + 1)] n'=n>0 


In Proc. Nat. Acad. Sci., 28, 371-374 (1942): 
On p. 371 the equation 
lim S? = 1 
n—> ' 
should be accompanied with the restriction | 1—z"! | ea | 1-—x |= 
1 the condition R(m + r) > 1 should be imposed. 








